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Abstract. - The typical behavior of optimal solutions to portfolio optimization problems with 
absolute deviation and expected shortfall models using replica analysis was pioneeringly estimated 
by S. Ciliberti and M. Mezard [Eur. Phys. B. 57, 175 (2007)]; however, they have not yet 
developed an approximate derivation method for finding the optimal portfolio with respect to 
a given return set. In this study, an approximation algorithm based on belief propagation for 
the portfolio optimization problem is presented using the Bethe free energy formalism, and the 
consistency of the numerical experimental results of the proposed algorithm with those of replica 
analysis is confirmed. Furthermore, the conjecture of H. Konno and H. Yamazaki, that the 
optimal solutions with the absolute deviation model and with the mean-variance model have the 
same typical behavior, is verified using replica analysis and the belief propagation algorithm. 



Introduction. — Portfolio optimization is one of the 
most fundamental frameworks of risk diversification man- 
agement. Its theory was introduced by Markowitz in 1959 
and is one of the most important areas being actively in- 
vestigated in financial engineering [1-3]. In their theo- 
retical research, Ciliberti and Mezard assessed the typi- 
cal behavior of optimal solutions to portfolio optimization 
problems, in particular those described by the absolute de- 
viation and expected shortfall models, using replica anal- 
ysis, one of the most powerful approaches in disordered 
systems. With this approach, they showed that the phase 
transitions of these optimal solutions were nontrivial [2]. 
However, they did not develop an effective algorithm for 
finding the optimal portfolio with respect to a fixed re- 
turn set. This requires a rapid algorithm for resolving the 
optimal portfolio problem with respect to a large enough 
in-sample set. 

As a first step in such a research direction, we propose 
an algorithm based on belief propagation, which is well- 
known as one of the most prominent algorithms in proba- 
bilistic inference, to resolve the microscopic averages of the 
optimal solution in a feasible amount of time for a fixed 
return set. We also confirm whether the numerical exper- 
imental results of our novel algorithm are consistent with 
the ones of replica analysis. Furthermore, the conjecture 



of Konno and Yamazaki, that if the return at each period 
is independently and identically drawn from the normal 
probability distribution [3], the optimal portfolio of the 
mean-variance model is consistent with that of the abso- 
lute deviation model, is supported using replica analysis 
and belief propagation. 

Model Setting. — Let us define the model setting for 
our discussion. A portfolio of N assets and the return at 
period fj, are represented by w = {w\ ,W2,- •• ,wn 
and — {xi fl ,X2 f _ l , ■ ■ • , xnh} T G R*, respectively, where 
Wk is the position of asset k, and we assume for simplicity 
that the mean of the return of asset k in period /i, Xk^,, 
is zero. The notation T indicates matrix transposition. 
Given a return set for p periods as reference, the problem is 
to minimize the following cost function (i.e., Hamiltonian) 
for the portfolio: 



H(w) 



(1) 



where R(u) represents a cost function, such as ^- in the 
mean- variance model and \u\ in the absolute deviation 
model, respectively. Furthermore, since the budget is as- 
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sumed to be finite, the following global constraint is set: reducibility condition of Eq. (5). By adding the term 

to the 



N 



k=l 



N. 



(2) 



One of our aims is to develop an effective general algo- 
rithm for solving this problem; in particular, our aim is 
an algorithm that works for all cost functions R(u) and 
all probability distributions of the returns. 

As a basis for the proposed algorithm, following exam- 
ples in statistical mechanics, we set the joint probability 
of portfolio w used in Eq. (1) using finite inverse absolute 
temperature j3 as follows: 



P(w) cx P (w)exp[-j3H (w)} 



p r 



oc 



n 



Po{w)g 



PfW, (3) 



where g(u) = e @ R ( U > is the likelihood function and 



cxp 



prior probability Pq (w) 

for sufficiently large N. Notice that the 
tition function of this posterior probability Z 



N 
par 



E^ITL i Po(w)g 



pl-p 

M) 



(w) is implicitly ignored 



1^=1 »\-« y ^ 
in this analysis because intuitively it is possible to eval- 
uate the first- and second-order moments of portfolio w k 
approximately without the partition function by the fol- 
lowing procedure. An arbitrary test probability of portfo- 
lio is defined as follows: 

p N 

Q{w) cx nM™>iK~>*)' w 

|U=1 fe=l 

where the reducibility condition on beliefs b k (w k ) and 



w\w k 



(5) 



must hold and w\w k denotes a subset of w from which 
w k is excluded. The Kullback-Liebler divergence (KLD) 
KL(Q\P) — J2w Q(™) lo S T^F) provides a useful guideline 
for deriving the belief propagation algorithm. However, 
since it is too complicated to directly assess KLD except in 
specific graphical models, we here approximate the Bethe 
free energy denoted as follows: 

+ (. -„g£«.*>*(£&)<«> 

where Pok(wk) oc e™' is used. The purpose of this step 
is to derive the optimal portfolio using the beliefs b k {w k ) 
and bfj.(w) that minimize the Bethe free energy under the 



right-hand side of Eq. (6), it is possible to maximize the 
Bethe free energy with respect to the; beliefs to obtain 



b k (w k ) cx P ok (w k )exp 
b^(w) cx P (w)g 



1 P 



-T -* 
W X, 



exp 



N 



A fcAt (w fc ) 



Furthermore, for simplicity, we set \ k ^{w k ) = 
T^E^i^K) + ^kn(w k ) as novel auxil- 
iary functions, and then b k (w k ) and b^{w) 
can be rewritten using E^=i ^kn(wk) = 

E^=i^( w fc) and \ kfi (w k ) = -E^) hu(w k ) 
as b k (w k ) cx P ok (w k )exp E^=i ^k^Wk) 



exp 



and 



bn(w) cx P (w)g ^ VN 

Moreover, applying the cumulant generating functions 
MOk) = logJ2bk(wk)e Wk9k , 



Wk 

^{e) = log^^e^*, (8) 



the first and second moments of w k have the compact 

forms m , - d M ") - fln H v , - - 

torms m wk ~ 96fc - d6k and \wk - —g^ — - 



at 9 — {6i, • • • , 9n} — > 0. This allows us to disre- 

k 

gard the calculation of the partition function. Then, our 
proposed algorithm for sufficiently large N comprises the 
following: 



hwk — 
Xwk 

Xwk — 



= Xwk (h wk + m) . 



1 P 

7= ^ ^ ^k^t^ujl + XwkTTlwk? 
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/2 X kf±Xu[t, 
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Xwk 
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dh uu 



Xufx 
Xuji 



log / Dzg (z 

J — oo ^ 

1 N 

fjq ^ ' x kfj, m wk ~ X 



XufJ, 



N ^ 



fe=l 
N 



/ j X knXwk 
k=l 

d 2 



(9) 
(10) 

(11) 
(12) 
x), (13) 
(14) 

(15) 



log/ Dzg (zy/Z^ + hup) ,(16) 



where I?z = -j=e~~ is used. Note that if X kfJ- (w k ) is 
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redefined as A feM (w fe ) = -^w 2 k + h kfl w k , then Xtufe = 
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Eu=i7feM and h 



wk = E^=i^ I 4 ' 5 !- In addition, 
Xwkin<wk and Xupjn Uil describe the Onsager reaction terms 
in the literature of spin glass theory (respectively [6,7]). 

Four points should be noticed here. First, the calcula- 
tion of this procedure is reduced from 0(N 3 ) to 0(N 2 ). 
For instance, in the case of the mean-variance model, al- 
though we are required to calculate the inverse matrix 
of the correlation matrix of return set XX T £ Mnxn, 
where return matrix X = {x\, • • • , x p } £ M. nx p , in order 
to assess the optimal solution rigorously, it is well-known 
that this calculation is 0(N 3 ). Moreover, fortunately it is 
found that in the case of the mean-variance model, this al- 
gorithm derives the exact optimal solution (see appendix 
A for details). Second, only Eqs. (13) and (16) are depen- 
dent on the likelihood function g(u) = e~^ R ^ u \ and the 
variables of index u are the only model dependent ones. 
Furthermore, m is determined by Eqs. (2) and (9). Third, 
the randomness of return is not assumed to be sampled 
from specific distributions. Because it is plausible that 
the assumption on the Bethe free energy approximation 
works correctly if the return at each period is asymptoti- 
cally not correlated with other returns . Lastly, we expect 
that in the limit as f3 — > oo, the estimate of the portfolio of 
asset k, m w k, asymptotically corresponds to the optimal 
portfolio with respect to the given return set. 

Application. — In order to confirm the effectiveness 
of our method, the numerical experimental results of the 
proposed algorithm and those of the replica analysis for 
the case of the Markowitz model are shown in Figs. 1 and 
2, where arc independently and identically drawn from 
the normal distribution with mean and variance and 1, 
respectively. The numerical experimental result of belief 
propagation is assessed from 10 2 samples of the number 
of assets N — 100 and is denoted by error bars and the 
result of replica analysis is denoted by a solid line. Both 
findings indicate that the two approaches are consistent 
with each other. 

With regard to the conjecture of Konno and Yamazaki, 
the variables in Eqs. (13) and (16), in the case of the 
mean-variance model 



rn 



Ufl 



1 + PXun 



1 + PXun 

and the absolute deviation model 



(17) 
(18) 



rn 



/3tanh 



fih ufl + - log 




H(u) ~ (\/27ra) e "2 in the case of u » 1, m ufl ~ 
— and Xuu — are estimated; that is, this finding 
indicates that the conjecture of Konno and Yamazaki is 
valid in part in the sense of the belief propagation ap- 
proach. See appendices for details. 

Conclusion. — In conclusion, we have discussed an 
effective algorithm for finding the optimal solution of the 
portfolio optimization problem with respect to an arbi- 
trary cost function according to Cilibcrti and Mezard [2] . 
With loss of generality, applying the likelihood function 
g(u) defined by the cost function R(u) dependent on the 
risk diversification problem, we proposed a novel approxi- 
mation derivation method based on one of the most power- 
ful estimation methods in probabilistic inference. In addi- 
tion, since two types of Onsager reaction terms are derived 
in Eqs (10) and (14), our algorithm provides the Thouless, 
Anderson, and Palmer approach rather than the mean- 
field approximation in the literature of spin glass theory. 
One advantage of our algorithm is that it rapidly converges 
by excluding the effect of self-response. In order to con- 
firm the effectiveness of the proposed approach, we have 
described the case of the mean-variance model. Further- 
more, we have shown that the conjecture of Konno and 
Yamazaki is supported by employing both approaches de- 
veloped in cross-disciplinary research involving statistical 
mechanics and information sciences. In future work, we 
will assess the properties of R(u) and the randomness of 
return that make solving the portfolio optimization prob- 
lem using belief propagation possible. 
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Appendix A: Proof of Exactness. — We here con- 
firm the exactness of the proposed belief propagation al- 
gorithm for the case of the Markowitz model. Our discus- 
sion is restricted to a > 1 for simplicity. From Eqs. (14), 
(17), and (18), we obtain m u = —-^=X T m w , where rn u = 

{m ul , - ■ ■ ,m up } T £ R p and m w = {m wl , ■ ■ ■ , m wN } T £ 
H N . Furthermore, me = — — L=Am„ follows immediately 

from Eqs. (9), (10), and (12), where e = {1, • • • , 1} T £ 
R w . Thus, substituting m w = Nrh (/3AA T ) e into the 
constraint N — e^fhy, gives the exact optimal solution 

n(xx t )- 1 s 



rn 



Xu^ 



dm ufl 
dh U a ' 



(20) 



e T (XX T )- 1 e' 



are assessed exactly using H(u) = Dz. Because 



Appendix B: Replica Analysis. According to 
Ciliberti and Mezard and Varga-Hoszonits and Kondor 
[2,8], replica symmetry solution of the portfolio optimiza- 
tion problem, where Xk^ is independently and identically 
distributed with N(0, 1), is derived as the following ex- 
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trcmum: 



1 



9- 1 , 1 



= Extr 



2X 



logx 



£y log / Dzg (z^ + y^) L (21) 

-oo ./ — oo J 

where Z = p o(w) n£ = i 5 (^ylf ) is the partition 
function and the notation [• • •] denotes the quenched av- 
erage over the return set. Moreover, the quenched over- 
lap parameters become q a b = jj Efc=i w k a w k b = X + 9 
if a = b and q otherwise by employing replica indices 
a, 6 = 1, 2, • • • , n and the assumption of replica symmetry. 
Furthermore, for large N and p, a — p/N ~ 0(1) remains 
finite and plays an important role as a control parameter 
with respect to phase transition phenomena. If g(u) = 

e~2" 2 , then q = (l — ^) and \ = (P( a ~ 1)) 1 can be 
exactly calculated in the case a > 1 and q — > oo, and 
X — > oo otherwise. This analytical finding is also verified 
in by the following. It is well known that the eigenvalue 
distribution of the correlation matrix C — -^XX T in the 



limit of N — > oo is asymptotically close to the Marchenko- 

V[A~A_] + [A + -A] + 

n2 



with 
There- 



Pastur law p(X) = [1 — a] d(X) , 2itX 

and [u] + = max{u, 0} [9 j 
fore, q = (j^) (-^) and one degree of the cost function 
e = lim./v->oo jj [H(w)} = \ (j) 1 are obtained straight- 
forwardly using (/(A)) = J^ oQ dXp(X)f{X). Applying 
Marchencko-Pastur law, that (i) = A ~ rA ' - 



2tt 



\ \ 2 

1 

A + 



(Q-l)^ 



if 



and <£) 

a > 1 and approach infinity otherwise follows directly 
[10-13]. This is consistent with the findings of replica 
analysis. 

In general, the order parameters are derived as follows: 



X 

q 



arj ' 
1 + ax 2 5, 



Dyy 



( [ Dzg 1 (zy/x + yy/q)^ 

J — OO 



s = 



Dy 



/OO 
Dzg(zy/x + y^/q) j 

Dzg 1 (Zy/x + yy/q)^ 
) 



Dzg (zy/x + yx/q) 



(22) 
(23) 

(24) 



(25) 



From Eqs. (22) and (23), 



1 S 

a rj 2 




Fig. 1: The reference ratio a = p/N (horizontal axis) versus 
the quenched overlap parameter q (vertical axis). The numer- 
ical experimental results from the proposed algorithm (error 
bars) are assessed from 10 2 experiments using N = 100 assets. 
Comparing with the results of replica analysis (solid line), the 
effectiveness of proposed algorithm is verified. 




Fig. 2: The reference ratio a (horizontal axis) versus one degree 
of the cost function e (vertical axis). This result also indicates 
that the approximation approach based on probabilistic infer- 
ence works correctly. 



is obtained. In the limit of sufficiently large j3 of g(u) = 
e -0M 5 if we assess r\ ~ — ^ and 6 ~ asymptotically, 
then the conjecture of Konno and Yamazaki is confirmed 
as correct in the sense of replica analysis. 

Appendix C: The Conjecture of Konno and Ya- 
mazaki. — This conjecture is related to the assessment 
of an annealed system in the context of spin glass the- 
ory. If the return at period /1, x^, is independently 
and identically drawn from AT (0, £), where £ £ M.nxn 
is variance- covariance matrix and w is fixed, the novel 

-♦T — 

variable z — is distributed as N(0,s 2 (w)) with 

s 2 (w) = jjW T T l w £ R. With respect to fixed w, em- 
ploying one degree of the cost function of the annealed 



(26) optimization problem e(w) = E^=i R ( 
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a DuR (us(w)) , which becomes £mv(w) — fs 2 (w) 
in the case of the mean- variance model and sad(w) = 
-7==\s(w)\ in the absolute deviation model. This implies 
that the optimal portfolios of the annealed situations of 
the two models are consistent with each other. Note 
that one degree of the cost function in the case of the 
annealed portfolio problem with the expected shortfall 

model, £ E s(w) = min^o a {try + H (^j) } with 7 > 
can also be assessed [14]. If s(w) < ^i- , then this opti- 
mal solution is identical to those of the previous mentioned 
models. This finding, that is, argmin,j;T^ =w £mv(w) = 
argmin,jT ?=A r £ad(w), is one part of the contributions re- 
ported by Konno and Yamazaki. 

However, they optimistically assumed wmv — wad with 
respect to a given return set X without any mathematical 
proof, using 
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p N N 



wmv = arg min — V V V w i w k x i ^x kfl ,(27) 



WAD 



= arg min > 

w T e=N ^— ' 



fi— 1 i— 1 k— 1 

N 



(28) 



As explained above, arg min^Tg-^jv £ mv(w) = 
&rgminrfTg =N £ad(w) with respect to the annealed opti- 
mization problem strictly holds; however, wmv = wad 
is not always satisfied. For example, in the simple case 
of N = p = 2 for the two returns x\ = {a, c} T and 
X2 = {b,d} T , their assumption wmv = wad docs not 
hold, except under specific special situations. 

Although this is apparently contradictory to these ob- 
tained findings from both approaches, it is necessary to 
recognize that the relation wmv = wad with a fixed re- 
turn set is equivalent to the sufficient condition qMV = 
<7ad, where q M v = lim^oo [wmv^mv] and <7ad = 
lim 



N- 



j; [wX D W AD ] 



are quenched averages of overlap 



parameters. Moreover, although wmv = wad does not 
hold in general, it is expected that the inner product 

— T — 

EgffiB is a PP roximatel y 1 because £ x klx x jv -> 

in the case of sufficiently large N [15]. 
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